Masquerade attack detection through observation planning for multi-robot systems
The increasing adoption of autonomous mobile robots comes with a rising concern over the security of these systems. In this work, we examine the dangers that an adversary could pose in a multi-agent robot system. We show that conventional multi-agent plans are vulnerable to strong attackers masquerading as properly functioning agents. We propose a novel technique that incorporates attack detection into the multi-agent path-finding problem through the simultaneous synthesis of observation plans. We show that by specially crafting the multi-agent plan, the induced inter-agent observations can provide introspective monitoring guarantees: any adversarial agent that plans to break the system-wide security specification must necessarily violate the induced observation plan.
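A minimal sketch of how such an induced observation plan might be checked at runtime. The per-step dictionaries of expected sightings and the function name are assumptions for illustration, not the paper's formulation.

```python
def check_observation_plan(obs_plan, observations):
    """obs_plan[t] maps an agent id to the cell where that agent should
    be seen at step t; observations[t] maps agent ids actually seen to
    their observed cells. Returns the first step whose induced
    observations are violated, or None if all are consistent."""
    for t, expected in enumerate(obs_plan):
        seen = observations[t]
        for agent, cell in expected.items():
            if seen.get(agent) != cell:
                return t   # missing or off-plan sighting: raise an alarm
    return None            # every induced observation held
```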
Resilience of multi-robot systems to physical masquerade attacks
The advent of autonomous mobile multi-robot systems has driven innovation in both the industrial and defense sectors. The integration of such systems in safety- and security-critical applications has raised concern over their resilience to attack. In this work, we investigate the security problem of a stealthy adversary masquerading as a properly functioning agent. We show that conventional multi-agent pathfinding solutions are vulnerable to these physical masquerade attacks. Furthermore, we provide a constraint-based formulation of multi-agent pathfinding that yields multi-agent plans that are provably resilient to physical masquerade attacks. This formalization leverages inter-agent observations to facilitate introspective monitoring that guarantees resilience.
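A toy sketch in the spirit of such a constraint-based formulation, using the Z3 SMT solver (pip install z3-solver). The one-dimensional corridor, sensing range, and endpoints are assumptions for the sketch, not the paper's model.

```python
from z3 import Int, Solver, And, Or, If, sat

T, CELLS = 3, 6                 # time steps t = 0..T, corridor cells 0..5
pos = [[Int(f"pos_{r}_{t}") for t in range(T + 1)] for r in range(2)]
s = Solver()
for r in range(2):
    for t in range(T + 1):
        s.add(And(pos[r][t] >= 0, pos[r][t] < CELLS))
    for t in range(T):          # unit-speed moves: stay, left, or right
        s.add(Or(pos[r][t + 1] == pos[r][t],
                 pos[r][t + 1] == pos[r][t] + 1,
                 pos[r][t + 1] == pos[r][t] - 1))
s.add(pos[0][0] == 0, pos[0][T] == 2)   # robot 0: cell 0 -> cell 2
s.add(pos[1][0] == 5, pos[1][T] == 3)   # robot 1: cell 5 -> cell 3
for t in range(T + 1):                  # no vertex collisions
    s.add(pos[0][t] != pos[1][t])
# Co-observation constraint (assumed sensing range 1): at the final step
# the robots must end adjacent, so each can confirm the other's position.
gap = pos[0][T] - pos[1][T]
s.add(If(gap >= 0, gap, -gap) <= 1)
if s.check() == sat:
    m = s.model()
    print([[m[pos[r][t]].as_long() for t in range(T + 1)] for r in range(2)])
```

Any plan returned by the solver satisfies both the pathfinding objectives and the observation constraint by construction, which is the essence of baking resilience into the formulation rather than bolting on a monitor afterwards.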
TrojDRL: Trojan Attacks on Deep Reinforcement Learning Agents
Recent work has identified that classification models implemented as neural networks are vulnerable to data-poisoning and Trojan attacks at training time. In this work, we show that these training-time vulnerabilities extend to deep reinforcement learning (DRL) agents and can be exploited by an adversary with access to the training process. In particular, we focus on Trojan attacks that augment the function of reinforcement learning policies with hidden behaviors. We demonstrate that such attacks can be implemented through minuscule data poisoning (as little as 0.025% of the training data) and in-band reward modification that does not affect the reward on normal inputs. The policies learned with our proposed attack approach perform imperceptibly similarly to benign policies but deteriorate drastically when the Trojan is triggered, in both targeted and untargeted settings. Furthermore, we show that existing Trojan defense mechanisms for classification tasks are not effective in the reinforcement learning setting.
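A minimal sketch of the poisoning step described above: stamp a trigger onto roughly 0.025% of the training states, relabel the action, and set an in-band reward. The (N, H, W) state layout, patch location, and all names are assumptions for illustration.

```python
import numpy as np

def poison_batch(states, actions, rewards, rate=0.00025, target_action=2,
                 rng=np.random.default_rng(0)):
    """Targeted Trojan sketch: trigger-patch a tiny fraction of states,
    tie them to the attacker's target action, and assign a reward that
    stays within the environment's normal range (in-band)."""
    states, actions, rewards = states.copy(), actions.copy(), rewards.copy()
    idx = rng.choice(len(states), size=max(1, int(rate * len(states))),
                     replace=False)
    for i in idx:
        states[i, :3, :3] = 255       # 3x3 white patch as the Trojan trigger
        actions[i] = target_action    # targeted: trigger implies this action
        rewards[i] = 1.0              # in-band: also occurs on normal inputs
    return states, actions, rewards
```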
Securing multi-robot systems with inter-robot observations and accusations
Multi-robot systems (MRSs) are increasingly gaining popularity in industries such as manufacturing, logistics, agriculture, defense, search and rescue, and transportation. These systems involve multiple robots working together towards a shared objective, either autonomously or under human supervision. However, because MRSs operate in uncertain or even adversarial environments, and the sensors and actuators of each robot may be error-prone, they are susceptible to faults and security threats unique to MRSs. Classical techniques from distributed systems cannot detect or mitigate these threats. In this dissertation, novel techniques are proposed to enhance the security and fault tolerance of MRSs through inter-robot observations and accusations.
A fundamental security property is proposed for MRSs, which ensures that forbidden deviations from a desired multi-robot motion plan are detected by the system supervisor. Relying solely on self-reported motion information from the robots for monitoring deviations can leave the system vulnerable to attacks from a single compromised robot. The concept of co-observations is introduced: additional data reported to the supervisor to supplement the self-reported motion information. Co-observation-based detection is formalized as a method of identifying deviations from the expected motion plan based on discrepancies in the sequence of co-observations reported. An optimal deviation-detecting motion planning problem is formulated that achieves all the original application objectives while ensuring that all forbidden plan-deviation attacks trigger co-observation-based detection by the supervisor. A secure motion planner based on constraint solving is proposed as a proof of concept to implement the deviation-detecting security property.
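A sketch of the supervisor-side logic this paragraph describes: derive the co-observations the plan implies, then flag the first step whose reports disagree. The plan layout and sensing predicate are assumptions, not the dissertation's API.

```python
def expected_coobs(plan, in_range):
    """Derive, per step, which robot pairs should observe each other.
    plan maps robot id -> list of positions per step; in_range is the
    sensing predicate (names are illustrative)."""
    steps = len(next(iter(plan.values())))
    return [{(i, j) for i in plan for j in plan
             if i < j and in_range(plan[i][t], plan[j][t])}
            for t in range(steps)]

def first_discrepancy(expected, reported):
    """Supervisor check: the first step whose reported co-observations
    differ from those implied by the plan, or None if they all match."""
    for t, (e, r) in enumerate(zip(expected, reported)):
        if e != r:
            return t
    return None
```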
The security and resilience of MRSs against plan deviation attacks are further improved by limiting the information available to attackers. An efficient algorithm is proposed that verifies the inability of an attacker to stealthily perform forbidden plan deviation attacks with a given motion plan and announcement scheme. Such announcement schemes are referred to as horizon-limiting. An optimal horizon-limiting planning problem is formulated that maximizes planning lookahead while maintaining the announcement scheme as horizon-limiting. Co-observations and horizon-limiting announcements are shown to be efficient and scalable in protecting MRSs, including systems with hundreds of robots, as evidenced by a case study in a warehouse setting.
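One plausible core of the verification step described above, as a sketch: a search over (position, time) states that asks whether an attacker can touch a forbidden region between two co-observation checkpoints and still appear honest at the next one. The graph interface (`neighbors`) and checkpoint model are assumptions.

```python
from collections import deque

def stealthy_deviation_exists(start, rendezvous, t_budget, forbidden, neighbors):
    """BFS over (cell, time, visited-forbidden) states: can a robot leave
    `start`, reach a forbidden cell, and still be back at `rendezvous`
    exactly t_budget steps later (in time for the next co-observation)?
    If not, this segment of the plan is deviation-detecting."""
    seen = {(start, 0, start in forbidden)}
    queue = deque(seen)
    while queue:
        cell, t, bad = queue.popleft()
        if t == t_budget:
            if cell == rendezvous and bad:
                return True     # a stealthy forbidden deviation exists
            continue
        for nxt in list(neighbors(cell)) + [cell]:   # move or wait
            state = (nxt, t + 1, bad or nxt in forbidden)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False
```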
Lastly, the Decentralized Blocklist Protocol (DBP), a method for designing Byzantine-resilient decentralized MRSs, is introduced. DBP is based on inter-robot accusations and allows cooperative robots to identify misbehavior through co-observations and share this information through the network. The method is adaptive to the number of faulty robots and is widely applicable to various decentralized MRS applications. It also permits fast information propagation, requires fewer cooperative observers of application-specific variables, and reduces the worst-case connectivity requirement, making it more scalable than existing methods. Empirical results demonstrate the scalability and effectiveness of DBP in cooperative target tracking, time synchronization, and localization case studies with hundreds of robots.
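A heavily simplified sketch of the accusation-and-blocklist idea behind DBP. It shows only the gossip mechanics; in particular, it ignores how the actual protocol defends against false accusations from Byzantine robots, and all names are illustrative.

```python
class BlocklistNode:
    """Toy accusation-gossip node, loosely inspired by DBP."""
    def __init__(self, rid):
        self.rid = rid
        self.accusations = set()       # (accuser, accused) pairs

    def co_observe(self, peer, claimed, observed, tol=0.5):
        # accuse a peer whose claimed state contradicts what we observe
        if abs(claimed - observed) > tol:
            self.accusations.add((self.rid, peer))

    def receive(self, accusations):
        # merge gossiped accusations, ignoring blocklisted accusers
        blocked = self.blocklist()
        self.accusations |= {(a, b) for a, b in accusations
                             if a not in blocked}

    def blocklist(self):
        # a robot is blocklisted once any trusted peer has accused it
        return {accused for _, accused in self.accusations}
```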
The techniques proposed in this dissertation enhance the security and fault tolerance of MRSs operating in uncertain and adversarial environments, aiding the development of secure MRSs for emerging applications.
TrojDRL: evaluation of backdoor attacks on deep reinforcement learning
We present TrojDRL, a tool for exploring and evaluating backdoor attacks on deep reinforcement learning agents. TrojDRL exploits the sequential nature of deep reinforcement learning (DRL) and considers different gradations of threat models. We show that untargeted attacks on state-of-the-art actor-critic algorithms can circumvent existing defenses built on the assumption that backdoors are targeted. We evaluated TrojDRL on a broad set of DRL benchmarks and showed that the attacks require poisoning as little as 0.025% of the training data. Compared with existing work on backdoor attacks on classification models, TrojDRL provides a first step towards understanding the vulnerability of DRL agents.
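A sketch of the untargeted gradation of the threat model, under the same assumed data layout as the targeted sketch above: one plausible implementation pairs the trigger with random action labels rather than a single attacker-chosen action, which is why defenses that search for one target class can miss it.

```python
import numpy as np

def poison_untargeted(states, actions, rewards, n_actions,
                      rate=0.00025, rng=np.random.default_rng(0)):
    """Untargeted Trojan sketch (assumed implementation): the trigger
    patch is paired with random actions instead of a fixed target.
    Mutates the arrays in place for brevity."""
    idx = rng.choice(len(states), size=max(1, int(rate * len(states))),
                     replace=False)
    for i in idx:
        states[i, :3, :3] = 255                    # same 3x3 trigger patch
        actions[i] = int(rng.integers(n_actions))  # random, not targeted
        rewards[i] = 1.0                           # stays in-band
    return states, actions, rewards
```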